Get plot data for prepostfit experiments #438


Merged: 38 commits into pymc-labs:main on Apr 17, 2025

Conversation

@lpoug (Contributor) commented Feb 7, 2025


Note: this is my first time doing a proper PR, so I'm not sure whether all the prerequisites are here.


📚 Documentation preview 📚: https://causalpy--438.org.readthedocs.build/en/438/

@lpoug lpoug marked this pull request as ready for review February 17, 2025 08:44
@drbenvincent (Collaborator)

Humble apologies for taking so long to get to this PR @lpoug. I've unfortunately not had as much time to spend on CausalPy as I'd have liked, but I'm hoping to catch up with the backlog.

There are currently a couple of issues with the remote checks. I'm hoping to get these resolved in #437, at which point I'll test this out locally and give feedback if necessary before we can merge this :)

codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 96.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 94.53%. Comparing base (2a6f9db) to head (da6c91d).
Report is 39 commits behind head on main.

Files with missing lines             Patch %   Lines
causalpy/experiments/base.py         84.21%    3 Missing ⚠️
causalpy/experiments/prepostfit.py   97.05%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #438      +/-   ##
==========================================
+ Coverage   94.40%   94.53%   +0.12%     
==========================================
  Files          31       31              
  Lines        1985     2068      +83     
==========================================
+ Hits         1874     1955      +81     
- Misses        111      113       +2     

☔ View full report in Codecov by Sentry.

@lpoug (Contributor, Author) commented Mar 3, 2025

> Humble apologies for taking so long to get to this PR @lpoug. I've unfortunately not had as much time to spend on CausalPy as I'd have liked, but I'm hoping to catch up with the backlog.
>
> There are currently a couple of issues with the remote checks. I'm hoping to get these resolved in #437, at which point I'll test this out locally and give feedback if necessary before we can merge this :)

Absolutely no problem whatsoever @drbenvincent! Let me know when the time comes, I'll be around 😄

@drbenvincent drbenvincent added the enhancement New feature or request label Mar 3, 2025
@drbenvincent (Collaborator)

Hi @lpoug. I've pushed some changes; can you make sure to pull the latest version?

I'll try to review this in the next few days :)

@lpoug (Contributor, Author) commented Apr 1, 2025

Hey there @drbenvincent. Just to be sure, are you waiting on anything on my side?

@drbenvincent (Collaborator)

Apologies for the delay! Just dropping in some review comments now.

@drbenvincent (Collaborator) left a comment

Sorry about the slow review on this. My bad.

Overall this looks good. I've suggested some minor changes. Other than that, the main thing is to update the tests to ensure this functionality remains working into the future.

Could you add new tests to test_integration_pymc_examples.py and test_integration_skl_examples.py? I imagine we can just test that we successfully get back a dataframe from calling result.get_plot_data on the experiments that you've implemented so far. You could optionally test that the contents of that dataframe are as expected, e.g. that it has the desired columns.

In theory, an ultra-pedantic person might want to test that we get an exception when calling get_plot_data on experiments that don't implement it.
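
For illustration, a minimal sketch of what both tests might look like. This is hedged: `result` and `unsupported_result` are hypothetical fixtures standing in for fitted experiment objects, and the column names and exception type are assumptions rather than CausalPy's confirmed API.

```python
# A sketch only: the fixtures, column names, and exception type below are
# assumptions for illustration, not CausalPy's actual test code.
import pandas as pd
import pytest


def test_get_plot_data_returns_dataframe(result):
    """get_plot_data should return a DataFrame for supported experiments."""
    plot_data = result.get_plot_data()
    assert isinstance(plot_data, pd.DataFrame)
    # Optionally check the contents, e.g. that the expected columns exist
    assert {"prediction", "pred_hdi_lower", "pred_hdi_upper"}.issubset(
        plot_data.columns
    )


def test_get_plot_data_raises_when_unsupported(unsupported_result):
    """Experiments without plot-data support should raise an exception."""
    with pytest.raises(NotImplementedError):
        unsupported_result.get_plot_data()
```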

Because this PR involves additional methods, can you run make uml? This should update the UML diagram that we include in CONTRIBUTING.md.

Sorry again about the latency on this review.

@lpoug (Contributor, Author) commented Apr 7, 2025

> Sorry about the slow review on this. My bad.
>
> Overall this looks good. I've suggested some minor changes. Other than that, the main thing is to update the tests to ensure this functionality remains working into the future.
>
> Could you add new tests to test_integration_pymc_examples.py and test_integration_skl_examples.py? I imagine we can just test that we successfully get back a dataframe from calling result.get_plot_data on the experiments that you've implemented so far. You could optionally test that the contents of that dataframe are as expected, e.g. that it has the desired columns.
>
> In theory, an ultra-pedantic person might want to test that we get an exception when calling get_plot_data on experiments that don't implement it.
>
> Because this PR involves additional methods, can you run make uml? This should update the UML diagram that we include in CONTRIBUTING.md.
>
> Sorry again about the latency on this review.

No problem at all! I was just starting to worry that something still needed to be done on my end 😅

Thank you for the reviews. I've added links to the relevant commits directly in your comments: 6a6face (renaming of functions).

Regarding tests, I have added them in 97f0d79

I have not added anything yet to test that we get an exception when calling the get_plot_data on experiments for which it is not implemented. I'll try to take a moment to think about how to do so precisely.

Finally, I have updated the diagrams in 0edca77

Let me know if these changes look good, or if you had anything else in mind!

@drbenvincent (Collaborator) left a comment

Looks good. I think we are very nearly there :) Thanks for adding in the tests.

It could be prudent to rename the *_hdi_lower and *_hdi_upper columns to include the numerical hdi_prob as a percentage. For example, if hdi_prob=0.8 then the columns could be labelled *_hdi_lower_80 and *_hdi_upper_80. That way there is much less scope for mistakes like generating 80% HDIs but then thinking you generated 95% HDIs. I think that should be pretty simple to do in _get_plot_data_bayesian or _get_plot_data_ols.
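
Something along these lines might do it (a minimal sketch; the `hdi_column_names` helper and the `pred` prefix are illustrative assumptions, not existing CausalPy code):

```python
# Sketch of embedding hdi_prob in the column labels; `hdi_column_names`
# is a hypothetical helper, not an existing CausalPy function.
def hdi_column_names(prefix: str, hdi_prob: float) -> tuple[str, str]:
    """Build HDI column labels that embed the probability as a percentage."""
    pct = round(hdi_prob * 100)  # e.g. 0.8 -> 80
    return f"{prefix}_hdi_lower_{pct}", f"{prefix}_hdi_upper_{pct}"


# With hdi_prob=0.8 this yields ("pred_hdi_lower_80", "pred_hdi_upper_80"),
# so an 80% interval can never masquerade as a 95% one.
lower_col, upper_col = hdi_column_names("pred", hdi_prob=0.8)
```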

It should also be pretty simple to resolve the merge conflicts; it's just the updated UML images as far as I can see.

Could you also update the docstrings of the tests? At the moment they list out what is tested, so you can just flag that they test the functionality of plot_data. I'm not sure we'll carry on doing that in the future if the number of tests gets large, but let's keep it up for the moment.

…ted tests accordingly, and updated tests' docstring
@lpoug (Contributor, Author) commented Apr 8, 2025

I made the changes for dynamic naming of *_hdi_upper and *_hdi_lower and adjusted the tests accordingly (please see my comment above regarding tests). I also updated the docstrings of the tests as requested.
See commit 44d3870

Regarding the merge conflicts, to be honest I'm not sure what I need to do on my side. Could you enlighten me, please?

Thanks!

@drbenvincent (Collaborator)

Hi @lpoug. I've sorted the conflicting files and done a few small things. I'm just noticing that at the moment the rendered docs obscure the allowable args/kwargs. So we have this:

[Screenshot 2025-04-16 at 10 08 06]

This doesn't give the user much hint of what they can pass.

Same situation if we look at an actual experiment class:
[Screenshot 2025-04-16 at 10 08 51]

I will have a quick play with this to see if we can expose the parameter information to the user in the docs. I think the easiest way is to revert a previous suggestion and make _get_plot_data_bayesian and _get_plot_data_ols public methods again. I'll experiment and make a commit shortly.
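
Roughly the shape of that change, as described above (a sketch under assumptions; the real class layout and signatures in causalpy/experiments/base.py may well differ):

```python
# Illustrative only: a base class whose public get_plot_data dispatches to
# public Bayesian/OLS variants so that Sphinx can document their kwargs.
import pandas as pd


class BaseExperiment:
    supports_bayes: bool = True  # assumed flag; the real check may differ

    def get_plot_data(self, *args, **kwargs) -> pd.DataFrame:
        """Dispatch to get_plot_data_bayesian or get_plot_data_ols.

        Making the two variants public means their parameters (e.g.
        hdi_prob in the Bayesian case) show up in the rendered docs.
        """
        if self.supports_bayes:
            return self.get_plot_data_bayesian(*args, **kwargs)
        return self.get_plot_data_ols(*args, **kwargs)

    def get_plot_data_bayesian(self, hdi_prob: float = 0.94) -> pd.DataFrame:
        raise NotImplementedError("Not implemented for this experiment.")

    def get_plot_data_ols(self) -> pd.DataFrame:
        raise NotImplementedError("Not implemented for this experiment.")
```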

@drbenvincent (Collaborator)

So now the docs expose the kwargs to the user. It's not ultra clean because we wanted to hide the bayesian vs ols functions, but at least this way the user can find out what (if any) kwargs they can pass. Here are some examples:

[Screenshot 2025-04-16 at 10 28 10]

Those functions are clickable, which takes you to this, for example:

[Screenshot 2025-04-16 at 10 28 25]

And the arviz.hdi link is also clickable and takes you through to the external docs.

@drbenvincent (Collaborator) commented Apr 16, 2025

Any other final changes @lpoug, or are you happy to merge this now?

Looks like there might just be a quick test coverage issue to deal with, but I'm sure we can get that done between us.

@drbenvincent drbenvincent self-requested a review April 17, 2025 08:09
@drbenvincent (Collaborator)

Success with code coverage and all tests passing. Hope you don't mind that I carried this over the finish line @lpoug. Sorry the process was a little slow; it will be faster on your next PR 😀

@drbenvincent drbenvincent merged commit 7e0ca34 into pymc-labs:main Apr 17, 2025
8 checks passed
@lpoug (Contributor, Author) commented Apr 18, 2025

Amazing, thank you @drbenvincent! Glad to have been a part of this 😄 Looking forward to the next, even more challenging one!

Labels: enhancement (New feature or request)
Closes: Get model results data
2 participants